I.  INTRODUCTION


The study of the effects of unusual environments on individuals often
entails the analysis of repeated measurements taken on a single subject.
Unfortunately, except under very restrictive sets of assumptions, no valid
statistical techniques have been developed for such an analysis. In
particular, for the first-order autoregressive process (AR(1)), no inference
procedure is currently available which enables the analyst to control the
probability of making an invalid conclusion. This problem is particularly
acute when only a relatively small number of observations are available for
the analysis.

The purpose of this investigation is to evaluate some of the procedures
which have been suggested for this situation. Of particular interest is the
difference between the nominal error rate chosen by the experimenter and the
actual error rate given by the procedure. It is also desirable to evaluate
how this difference is affected by the number of observations used in the
analysis.

In a previous Desmatics technical report [1], Burns and Smith discussed
the problem of testing hypotheses about the mean of an AR(1). Among the test
statistics considered in that investigation were a modified t statistic,
originally proposed by Higgins [2], and a more standard technique which
involves transforming the observed data and treating the transformed
observations as an independent sample. These two testing procedures are
investigated more extensively in this report. In addition, both of these
procedures require an estimate of the autocorrelation, and two such estimates
are considered here. The first is the standard estimate, as given in any
elementary statistics text, while the second estimator includes a correction
for bias. Thus, four procedures in all are considered in this investigation.

It should be noted here that the standard t statistic, which gives a
valid test procedure only when the observations are independent, is not
included in this investigation. The principal reason for this omission is that
this procedure performs substantially worse, for any size sample, than any
of the procedures which are considered. (This was shown clearly in the
previous technical report.) Furthermore, since both of the estimators used
here for the autocorrelation are consistent, the four procedures being
considered are at least asymptotically valid, while the standard t statistic
gives very poor results even in the asymptotic case.

II.  STATISTICAL

The problem considered here is that of testing hypotheses about the
mean of a first-order autoregressive process. The model may be specified
as follows:

$$Y_t = \mu + \varepsilon_t ,$$

where

$$\varepsilon_t = \rho\,\varepsilon_{t-1} + e_t , \quad t = 2, 3, \ldots, n,$$

$$\varepsilon_1 \sim N(0, \sigma^2/(1-\rho^2)),$$

$e_t \sim N(0, \sigma^2)$ for $t \geq 2$, and
$\varepsilon_1, e_2, \ldots, e_n$ are independent.

Thus, the correlation between any two observations $Y_i$ and $Y_j$ is
$\rho^{|i-j|}$. Also, $Y_t \sim N(\mu, \sigma_Y^2)$ for all $t$, where
$\sigma_Y^2 = \sigma^2/(1-\rho^2)$.
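
As a minimal sketch of the model above, the following Python function generates one realization of the process. The name `simulate_ar1` and its arguments are illustrative assumptions, not part of the original report. The first error term is drawn from the stationary distribution so that every observation has variance $\sigma^2/(1-\rho^2)$.

```python
import math
import random


def simulate_ar1(n, mu, rho, sigma=1.0, rng=None):
    """Return one realization Y_1, ..., Y_n of a stationary AR(1) process.

    Hypothetical helper: eps_1 ~ N(0, sigma^2 / (1 - rho^2)) and
    eps_t = rho * eps_{t-1} + e_t with e_t ~ N(0, sigma^2) for t >= 2.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    rng = rng or random.Random()
    # Stationary start: the first error has the marginal variance.
    eps = rng.gauss(0.0, sigma / math.sqrt(1.0 - rho ** 2))
    ys = [mu + eps]
    for _ in range(n - 1):
        eps = rho * eps + rng.gauss(0.0, sigma)
        ys.append(mu + eps)
    return ys
```

Passing a seeded `random.Random` instance as `rng` makes a simulation repeatable.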

As in the earlier technical report, attention is restricted to testing
the hypothesis $H_0: \mu = 0$ vs. $H_1: \mu > 0$. As mentioned in that report,
procedures used for testing this hypothesis may easily be extended to more
complicated situations, such as tests concerning an intervention effect.

If $\rho = 0$, the problem given above reduces to that of testing whether
$\mu = 0$ in a normal distribution. The observations are independent and the
appropriate test statistic is $T = \sqrt{n}\,\bar{Y}/s$, which follows Student's
t distribution with n-1 d.f. When $\rho > 0$, use of this statistic leads to a
seriously inflated type I error rate. (See, for example, [1] or [2].) This
inflated error rate is primarily a result of the fact that $s^2/n$
underestimates the variance of $\bar{Y}$, which is approximately
$(\sigma_Y^2/n)\left(\frac{1+\rho}{1-\rho}\right)$, where $\sigma_Y^2$ is the
variance of each $Y_t$.

This approximation led Higgins to suggest the modified statistic

$$TC = \left(\frac{1-\hat{\rho}}{1+\hat{\rho}}\right)^{1/2} T ,$$

where $\hat{\rho}$ is an appropriate estimate of the autocorrelation.
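
The modified statistic can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used for the report: the function names are hypothetical, the estimate of the autocorrelation is passed in as `rho_hat`, and the correction factor is taken to be the square root of $(1-\hat{\rho})/(1+\hat{\rho})$, which is what the variance approximation above implies.

```python
import math


def t_statistic(ys):
    """Ordinary one-sample statistic T = sqrt(n) * mean / s."""
    n = len(ys)
    mean = sum(ys) / n
    s2 = sum((y - mean) ** 2 for y in ys) / (n - 1)
    return math.sqrt(n) * mean / math.sqrt(s2)


def tc_statistic(ys, rho_hat):
    """Modified statistic TC = sqrt((1 - rho_hat) / (1 + rho_hat)) * T.

    Assumption of this sketch: rho_hat must be greater than -1,
    otherwise the correction factor is undefined.
    """
    if rho_hat <= -1.0:
        raise ValueError("rho_hat must be greater than -1")
    return math.sqrt((1.0 - rho_hat) / (1.0 + rho_hat)) * t_statistic(ys)
```

With `rho_hat = 0` the factor equals 1 and TC reduces to the ordinary T; positive estimates shrink the statistic toward zero.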

Another alternative test statistic may be obtained by considering the
transformation $Z_t = Y_t - \rho Y_{t-1}$. When $\rho$ is known, this
transformation yields a set of n-1 independent observations from a normal
distribution with mean $\nu = (1-\rho)\mu$. Since $\nu = 0$ when $\mu = 0$ and
$\nu > 0$ when $\mu > 0$, the hypothesis $H_0: \nu = 0$ vs. $H_1: \nu > 0$ is
equivalent to the original hypothesis. The appropriate test statistic is
$TR = \sqrt{n-1}\,\bar{Z}/s_Z$. Of course, in practical situations, $\rho$ is
not known. Therefore, some estimate of the autocorrelation, $\hat{\rho}$, must
be used for the transformation. Obviously, this procedure will only be as
good as the estimate used.
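
A hedged sketch of the transformation approach follows. The function name `tr_statistic` is an assumption made for illustration; the estimate `rho_hat` is supplied by the caller, and the transformed values are treated as an independent sample of size n-1.

```python
import math


def tr_statistic(ys, rho_hat):
    """TR = sqrt(n - 1) * mean(Z) / s_Z, where Z_t = Y_t - rho_hat * Y_{t-1}."""
    # The transformation uses each observation and its predecessor,
    # so n observations yield n - 1 transformed values.
    zs = [ys[t] - rho_hat * ys[t - 1] for t in range(1, len(ys))]
    m = len(zs)
    z_bar = sum(zs) / m
    s2 = sum((z - z_bar) ** 2 for z in zs) / (m - 1)
    return math.sqrt(m) * z_bar / math.sqrt(s2)
```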

The standard estimate of the first-order serial correlation is:

$$\hat{\rho}_1 = \left\{ \sum_{i=1}^{n-1} (Y_i - \bar{Y})(Y_{i+1} - \bar{Y}) \right\} \Big/ \sum_{i=1}^{n} (Y_i - \bar{Y})^2 .$$

Unfortunately, this estimate of the autocorrelation is biased, especially
when the number of observations is small. A less biased estimate, which is
derived in [3], is $\hat{\rho}_2 = [(n-1)\hat{\rho}_1 + 1]/(n-4)$. Since the
range of $\hat{\rho}_1$ is [-1,1], it is possible for $\hat{\rho}_2$ to have
inadmissible values (values outside of [-1,1]), particularly when n is small.
In that case, $\hat{\rho}_2 = 1$ (or -1) is used.
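
The two estimators can be written out as follows. This is a sketch with hypothetical function names; the truncation of inadmissible values to 1 or -1 follows the rule stated above.

```python
def rho_hat_1(ys):
    """Standard estimate of the first-order serial correlation."""
    n = len(ys)
    mean = sum(ys) / n
    num = sum((ys[i] - mean) * (ys[i + 1] - mean) for i in range(n - 1))
    den = sum((y - mean) ** 2 for y in ys)
    return num / den


def rho_hat_2(ys):
    """Bias-corrected estimate [(n - 1) * rho_hat_1 + 1] / (n - 4).

    Values outside [-1, 1] are replaced by the nearest endpoint.
    """
    n = len(ys)
    if n <= 4:
        raise ValueError("the correction requires more than 4 observations")
    corrected = ((n - 1) * rho_hat_1(ys) + 1.0) / (n - 4)
    return max(-1.0, min(1.0, corrected))
```

For small n the correction is large: with n = 10, an uncorrected estimate of 0 becomes 1/6.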

As mentioned earlier, four different test statistics are included in
this investigation. Two of these statistics, denoted TC1 and TC2, are
obtained by using $\hat{\rho}_1$ and $\hat{\rho}_2$, respectively, as the
estimate of the autocorrelation when calculating TC. (TC1 is the test
statistic studied by Higgins.) The other two statistics, denoted TR1 and TR2,
are calculated by using $\hat{\rho}_1$ and $\hat{\rho}_2$ to transform the data
and proceeding as described for TR above.

Five different autocorrelations ($\rho$ = .5, .7, .8, .85, .9) and five sample
sizes (n = 10, 20, 30, 50, 100) were used in this investigation. For each
value of ($\rho$, n), 1000 samples were simulated and the four test statistics
calculated for each sample. From the 1000 simulations, the empirical
distribution function was found for each test statistic. Finally, using
$t_\alpha(n-1)$ as the critical value, the empirical significance level was
found using each of three different nominal significance levels
($\alpha$ = .05, .025, .01). The values obtained in this way are given in
Tables 1 through 5. (The predicted values, which are also given in the
tables, will be discussed later.)
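
The simulation procedure described above can be sketched for one of the four statistics (TR2). Everything here is an illustrative assumption: the function names, the use of Python's `random` module, and the choice to pass the critical value $t_\alpha(n-1)$ in as a number taken from a t table. It is not the program used to produce Tables 1 through 5.

```python
import math
import random


def simulate_null_ar1(n, rho, rng):
    """One stationary AR(1) sample with mean 0 and unit innovation variance."""
    eps = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - rho ** 2))
    ys = [eps]
    for _ in range(n - 1):
        eps = rho * eps + rng.gauss(0.0, 1.0)
        ys.append(eps)
    return ys


def rho_hat_2(ys):
    """Bias-corrected autocorrelation estimate, truncated to [-1, 1]."""
    n = len(ys)
    mean = sum(ys) / n
    num = sum((ys[i] - mean) * (ys[i + 1] - mean) for i in range(n - 1))
    den = sum((y - mean) ** 2 for y in ys)
    corrected = ((n - 1) * (num / den) + 1.0) / (n - 4)
    return max(-1.0, min(1.0, corrected))


def tr_statistic(ys, rho_hat):
    """t statistic computed from Z_t = Y_t - rho_hat * Y_{t-1}."""
    zs = [ys[t] - rho_hat * ys[t - 1] for t in range(1, len(ys))]
    m = len(zs)
    z_bar = sum(zs) / m
    s2 = sum((z - z_bar) ** 2 for z in zs) / (m - 1)
    return math.sqrt(m) * z_bar / math.sqrt(s2)


def empirical_level(n, rho, critical_value, reps=1000, seed=0):
    """Fraction of null samples in which TR2 exceeds the critical value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        ys = simulate_null_ar1(n, rho, rng)
        if tr_statistic(ys, rho_hat_2(ys)) > critical_value:
            rejections += 1
    return rejections / reps
```

For example, `empirical_level(10, 0.5, 1.833)` estimates the actual level of TR2 for n = 10, $\rho$ = .5 and nominal $\alpha$ = .05, where 1.833 is the tabulated upper 5% point of the t distribution with 9 d.f. The result depends on the random number generator and seed, so it will not match the tabulated values exactly.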

It should be noted here that the observed empirical significance levels
presented in Tables 1 through 5 are only estimates of the significance levels
which are obtained when using the procedures being discussed in this report.
The variability of these estimates may be calculated by considering the method
by which they were obtained. For each test statistic, the empirical
significance levels were calculated by counting the number of times, out of
1000 simulations, the statistic exceeded a specified critical value, and
dividing that count by 1000. The count is a random variable having a binomial
distribution. Therefore, if p is the true probability of exceeding the
critical value, the standard deviation of the empirical significance level is
$[p(1-p)/1000]^{1/2}$. If p = .05, for example, the standard deviation is
.0069 and a 95% confidence interval for the empirical significance level is
(.036, .064). (The normal approximation to the binomial distribution is used
here to compute the confidence interval. This is nearly exact for n = 1000.)
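
The arithmetic behind these figures can be checked with a short sketch. The function name is hypothetical, and the multiplier 1.96 is the usual normal quantile for a 95% interval.

```python
import math


def proportion_interval(p, reps=1000, z=1.96):
    """Standard deviation and normal-approximation interval for a proportion."""
    sd = math.sqrt(p * (1.0 - p) / reps)
    return sd, (p - z * sd, p + z * sd)
```

For p = .05 and 1000 replications, the standard deviation rounds to .0069 and the interval rounds to (.036, .064), in agreement with the values quoted above.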

The actual significance levels for each test statistic are expected to
be monotone decreasing functions of sample size, since the precision of the
estimate of $\rho$ increases as the sample size increases. The fact that the
values given in the tables do not always follow this pattern is attributable
to the statistical variation described above. For $\rho$ = .5, in particular,
the observed values tend to behave erratically. For larger values of $\rho$,
the actual change in the significance level as a function of sample size is
large enough to overwhelm any small fluctuations due to statistical
variability.

Since the observed empirical significance levels behave somewhat
erratically as a function of sample size, it is difficult to determine
how best to interpolate between the values in the tables, or how a given
value should perhaps be adjusted after consideration of the values nearest
it. It was decided that the best way to accomplish both purposes was to
fit a function to each set of five observed significance levels. (The
five values are for the five sample sizes considered with both $\rho$ and
$\alpha$ fixed.) Since the values in Tables 1 through 5 appear to decrease
approximately exponentially as functions of sample size, functions of that
type were first considered. It was finally decided, however, that in order to
achieve increased flexibility, gamma functions should be fit to the data.
These are functions of the form:

$$T(n) = \beta_0\, n^{\beta_1} e^{\beta_2 n} .$$

These are monotone decreasing functions as long as $\beta_1 \leq 0$ and
$\beta_2 < 0$. In order for the functions to be asymptotically equal to the
nominal significance level, functions of the form $\alpha + T(n)$ were
actually fit to the data.
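
The monotonicity claim can be verified with one line of calculus. The following is a sketch that additionally assumes $\beta_0 > 0$ and $n > 0$, neither of which is stated explicitly in the report.

```latex
\[
\frac{d}{dn}\,\ln T(n)
  = \frac{d}{dn}\left(\ln\beta_0 + \beta_1 \ln n + \beta_2 n\right)
  = \frac{\beta_1}{n} + \beta_2 < 0
  \qquad \text{whenever } \beta_1 \le 0 \text{ and } \beta_2 < 0 .
\]
```

Under the same conditions $T(n) \to 0$ as $n \to \infty$, so that $\alpha + T(n)$ tends to the nominal level $\alpha$.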

Least squares regression, applied to the log transformation of the
data, was used to fit the functions given above. That is, the functions
actually fit were of the form:

$$\ln(y - \alpha) = \ln\beta_0 + \beta_1 \ln n + \beta_2 n ,$$

where y is the observed significance level. Unfortunately, many (22 of 60)
of the functions fit in this way did not satisfy the restriction that
$\beta_2 < 0$. In those cases, the functions:

$$\ln(y - \alpha) = \ln\beta_0 + \beta_1 \ln n$$

were used. In all of the cases where they were used, the simple functions
gave almost as good a fit as did the full gamma functions. (Simple
exponential functions were also considered, but did not fit the data well.)
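
The fitting procedure can be sketched as an ordinary least squares fit on the log scale, with a fallback to the two-parameter form when the estimate of $\beta_2$ is not negative. The function names are hypothetical, the normal equations are solved by a small Gaussian elimination routine so that no external library is needed, and the sketch assumes every observed level exceeds the nominal $\alpha$ (otherwise the logarithm is undefined).

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            f = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * k
    for r in reversed(range(k)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, k))
        x[r] = (m[r][k] - tail) / m[r][r]
    return x


def fit_smoother(ns, levels, alpha):
    """Fit ln(y - alpha) = ln b0 + b1 ln n + b2 n; return (b0, b1, b2).

    If the fitted b2 is not negative, refit without the b2 term
    and report b2 = 0.
    """
    target = [math.log(y - alpha) for y in levels]

    def least_squares(columns):
        rows = list(zip(*columns))
        k = len(columns)
        xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
               for i in range(k)]
        xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
        return solve_linear(xtx, xty)

    ones = [1.0] * len(ns)
    log_n = [math.log(n) for n in ns]
    c0, b1, b2 = least_squares([ones, log_n, [float(n) for n in ns]])
    if b2 >= 0:
        c0, b1 = least_squares([ones, log_n])
        b2 = 0.0
    return math.exp(c0), b1, b2


def predict(n, alpha, b0, b1, b2):
    """Smoothed significance level alpha + b0 * n**b1 * exp(b2 * n)."""
    return alpha + b0 * n ** b1 * math.exp(b2 * n)
```

Calling `fit_smoother([10, 20, 30, 50, 100], levels, 0.05)` with the five observed levels for one statistic returns the coefficients, and `predict` then gives smoothed values at the tabulated sample sizes or at intermediate ones.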

The fitted functions have been plotted and are presented in Figures 1
through 15. The predicted values for each function have also been
calculated for each of the sample sizes used in the regressions. These
predicted values are listed in Tables 1 through 5 so that they may be compared
to the observed values.

Comparison of the observed and predicted values in the tables shows
that in most cases the functions fit the data remarkably well. The
exception to this occurs when $\rho$ = .5, particularly for n = 10. However, as
mentioned earlier, the observed values exhibit rather erratic behavior
when $\rho$ = .5 and the functions cannot be expected to fit well in this
situation. Another fact which should be noted is that the predicted and
observed values are closest for large n. This is to be expected since the
fitting was done on the log scale, which gives added weight to small values.

From Figures 1 through 15, it is clear that the two test statistics
using $\hat{\rho}_2$ perform substantially better, for all values of $\rho$,
than the test statistics which use $\hat{\rho}_1$. Furthermore, there is
little difference between TR and TC, using either estimate of $\rho$, although
TC1 generally does slightly better than TR1 and TR2 generally does slightly
better than TC2. As could be expected, all of the test statistics perform
better for moderate autocorrelations than they do when the autocorrelation is
very high. For TR2, for example, with nominal $\alpha$ = .05 and $\rho$ = .5, a
sample size of about 12 is needed to obtain an estimated significance level
of .075. When $\rho$ = .9, a sample size of 100 gives the same estimated level
of significance.

As an example of how Figures 1 through 15 might be used, suppose that
20 repeated measurements are taken on an individual and that those
measurements are assumed to follow a first-order autoregressive process. From
the estimated autocorrelation, in conjunction with any prior information, the
experimenter decides that $\rho$ is in the interval (.7, .9). Now suppose
TR2 is used to test whether the mean of the process is zero with nominal
$\alpha$ = .01. From Figures 6 and 15 one can obtain rough bounds on the actual
significance level. In this case, the bounds are (.040, .087).

III.  SUMMARY


Four test statistics have been considered as candidates for testing
$H_0: \mu = 0$ vs. $H_1: \mu > 0$ when the observations are taken from an AR(1)
with autocorrelation $\rho$. A set of 1000 samples was generated for each of
five different sample sizes (n = 10, 20, 30, 50, 100) and five different
autocorrelations ($\rho$ = .5, .7, .8, .85, .9). From the 1000 simulations, the
empirical distribution functions were calculated for each test statistic.
Finally, using $t_\alpha(n-1)$ as the critical value corresponding to a
specified nominal significance level, the empirical significance levels were
found and tabulated.

Since these empirical significance levels were found to fluctuate
erratically due to statistical variation, smoothing functions were fit to
the five values for each combination of $\rho$, $\alpha$, and test statistic.
These functions are also an aid in interpolation between sample sizes. The
functions are presented graphically and an example is given as to how they
might be used in practice.

Table 1: Comparison of Observed Empirical Significance Levels With Those Predicted
by Smoothing Function for Four Test Statistics: $\rho$ = .5.

Table 5: Comparison of Observed Empirical Significance Levels With Those Predicted
by Smoothing Function for Four Test Statistics: $\rho$ = .9.

Figure 1: Empirical Significance Level as a Function
of Sample Size for $\rho$ = .5 and Nominal $\alpha$ = .05

Figure 2: Empirical Significance Level as a Function
of Sample Size for $\rho$ = .5 and Nominal $\alpha$ = .025

Figure 4: Empirical Significance Level as a Function
of Sample Size for $\rho$ = .7 and Nominal $\alpha$ = .05

Empirical Significance Level as a Function
of Sample Size for $\rho$ = .7 and Nominal $\alpha$ = .0…

Empirical Significance Level as a Function
of Sample Size for $\rho$ = .8 and Nominal $\alpha$ = .05

Empirical Significance Level as a Function
of Sample Size for $\rho$ = .8 and Nominal $\alpha$ = .0…

Empirical Significance Level as a Function
of Sample Size for $\rho$ = .8 and Nominal $\alpha$ = .01

Figure 10: Empirical Significance Level as a Function
of Sample Size for $\rho$ = .85 and Nominal $\alpha$ = .05

Figure 12: Empirical Significance Level as a Function
of Sample Size for $\rho$ = .85 and Nominal $\alpha$ = .01

Figure 13: Empirical Significance Level as a Function
of Sample Size for $\rho$ = .9 and Nominal $\alpha$ = .05

Empirical Significance Level as a Function
of Sample Size for $\rho$ = .9 and Nominal $\alpha$ = .025